A Semantically Compositional Annotation Scheme for Time Normalization

Authors

  • Steven Bethard
  • Jonathan Parker
Abstract

We present a new annotation scheme for normalizing time expressions, such as three days ago, to computer-readable forms, such as 2016-03-07. The annotation scheme addresses several weaknesses of the existing TimeML standard, allowing the representation of time expressions that align to more than one calendar unit (e.g., the past three summers), that are defined relative to events (e.g., three weeks postoperative), and that are unions or intersections of smaller time expressions (e.g., Tuesdays and Thursdays). It achieves this by modeling time expression interpretation as the semantic composition of temporal operators like UNION, NEXT, and AFTER. We have applied the annotation scheme to 34 documents so far, producing 1104 annotations, and achieving inter-annotator agreement of 0.821.
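To make the idea of composing temporal operators concrete, here is a minimal sketch in Python. It is not the authors' annotation schema or any released tool: the names `Period`, `before`, `weekday_in`, and `union` are hypothetical, and the document date of 2016-03-10 is an assumption chosen so that "three days ago" resolves to the abstract's example value, 2016-03-07.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Period:
    """A fixed amount of time such as 'three days' (hypothetical type)."""
    days: int


def before(anchor: date, period: Period) -> date:
    """BEFORE-style operator: shift the anchor back by the period."""
    return anchor - timedelta(days=period.days)


def weekday_in(start: date, end: date, weekday: int) -> set:
    """All dates in [start, end) that fall on `weekday` (Monday == 0)."""
    span = (end - start).days
    candidates = (start + timedelta(days=i) for i in range(span))
    return {d for d in candidates if d.weekday() == weekday}


def union(*date_sets: set) -> set:
    """UNION-style operator: merge the results of smaller expressions."""
    return set().union(*date_sets)


# Assumed document creation time; not stated in the abstract.
doc_time = date(2016, 3, 10)

# "three days ago" = BEFORE(doc_time, Period(3 days))
three_days_ago = before(doc_time, Period(days=3))

# "Tuesdays and Thursdays", restricted here to one assumed week
week_start, week_end = date(2016, 3, 7), date(2016, 3, 14)
tue_thu = union(
    weekday_in(week_start, week_end, 1),  # Tuesday
    weekday_in(week_start, week_end, 3),  # Thursday
)

print(three_days_ago.isoformat())
print(sorted(d.isoformat() for d in tue_thu))
```

Each expression is built by nesting small operators rather than by filling a single flat value, which is the property the abstract attributes to the scheme. A real implementation would also need intervals, calendar units beyond days, and event-relative anchors, none of which this sketch covers.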

Related articles

An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies

A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...

CUILESS2016: a clinical corpus applying compositional normalization of text mentions

BACKGROUND Traditionally text mention normalization corpora have normalized concepts to single ontology identifiers ("pre-coordinated concepts"). Less frequently, normalization corpora have used concepts with multiple identifiers ("post-coordinated concepts") but the additional identifiers have been restricted to a defined set of relationships to the core concept. This approach limits the abili...

Semantic annotation to characterize contextual variation in terminological noun compounds: a pilot study

Noun compounds (NCs) are semantically complex and not fully compositional, as is often assumed. This paper presents a pilot study regarding the semantic annotation of environmental NCs with a view to accessing their semantics and exploring their domain-based contextual variation. Our results showed that the semantic annotation of NCs afforded important insights into how context impacts their co...

An Image Database Semantically Structured based on Automatic Image Annotation for Content-Based Image Retrieval

In this paper, we presented a semantically structured image database for content-based image retrieval. A class descriptor is proposed to represent each class using a multiprototype model, which can be obtained by using a learning scheme, such as the Unsupervised Optimal Fuzzy Clustering algorithm, on a group of sample images manually selected from the class. Based on the proposed Image-Class M...

HeidelTime: High Quality Rule-Based Extraction and Normalization of Temporal Expressions

Different types
  • Date: On May 22, 1995, Farkas was ...
  • Time: ... in Brownsville around 7:15 p.m.
  • Duration: He spent six days abroad ...
  • Set: ... for liver transplants each year ...

Different occurrences in documents
  • explicit: easy to normalize
  • implicit: knowledge is needed
  • relative: reference time is needed (& additional information)

Annotation scheme
  • TimeML: ISO standard for temporal an...

Publication date: 2016